Breakout deallocation calls into simpler smaller files #2070
Conversation
@islas This is so cool! Thanks for working on this! Does this affect compile time in any way?
It only affects compile times as the number of threads goes up. For typical compilation with
Using gfortran/gcc 34/MPI-enabled with ALL PR changes (#2070, #2069, #2068)
I tested the code before and after this PR, and the model produces identical results in my test. Also, with 4 processors, the compile time is about 12 minutes!
The regression test results:
Test Type | Expected | Received | Failed
Just a minor request to tidy up some whitespace; otherwise, this looks good to me.
@islas Should we add the generated |
I see that there is a
to the |
I think being as restrictive as possible would be good, so I'd favor
TYPE: enhancement
KEYWORDS: intel, compilation, llvm, memory
SOURCE: internal
DESCRIPTION OF CHANGES:
Problem:
The Intel oneAPI compilers (and others, such as nvhpc) struggle with some of the larger files (15k+ lines of code) within WRF. Compiling these files demands large amounts of memory, more than is often available to the average user outside a resource-rich environment. This often limits compilation to a single thread, if it is possible at all, or to a dedicated environment with enough memory, if one is available. Users with neither option cannot use these configurations at all.
Solution:
This PR focuses on the `deallocs.inc` sections of code used in `module_domain` to reduce the include size to manageable levels. The include is instead broken out into many smaller files as external subroutines. The files are fully generated source code from the registry, and the calls to the subroutines are generated as well. This also makes it relatively easy to change the number of files generated from a source code perspective. Build rules would need to be modified accordingly, as seen in these changes.
TESTS CONDUCTED:
Attached to this PR are plots of the respective effects of these changes. Changes were tested with the Intel and GCC compilers, but only Intel memory usage is shown, as the Intel compiler exacerbates the memory usage issue.
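For readers unfamiliar with the approach described under "Solution", the following is a minimal Python sketch of the general idea: one long run of deallocation statements is distributed across a configurable number of small generated subroutines, and the matching `CALL` lines are generated alongside them. It is not the registry code from this PR. The function name `split_deallocs`, the `deallocs_<n>` subroutine and file names, the `grid` argument, and the sample field names are all assumptions made for illustration, and the emitted Fortran is schematic (it omits declarations and is not meant to compile as-is).

```python
def split_deallocs(statements, num_files, prefix="deallocs"):
    """Distribute statements over num_files schematic Fortran subroutines.

    Returns (sources, calls): a dict mapping hypothetical file names to
    generated text, and the list of CALL lines that replace the include.
    """
    if num_files < 1:
        raise ValueError("num_files must be at least 1")

    # Ceiling division so every statement lands in exactly one file.
    chunk = -(-len(statements) // num_files)

    sources = {}
    calls = []
    for i in range(num_files):
        name = f"{prefix}_{i}"  # hypothetical naming scheme
        body = statements[i * chunk:(i + 1) * chunk]

        lines = [f"SUBROUTINE {name}( grid )"]
        lines += [f"  {stmt}" for stmt in body]
        lines.append(f"END SUBROUTINE {name}")

        sources[f"{name}.F"] = "\n".join(lines) + "\n"
        calls.append(f"CALL {name}( grid )")
    return sources, calls


if __name__ == "__main__":
    # Hypothetical field names, used only to build sample statements.
    fields = ["u", "v", "w", "t", "ph"]
    stmts = [
        f"IF ( ASSOCIATED( grid%{fld} ) ) DEALLOCATE( grid%{fld} )"
        for fld in fields
    ]
    sources, calls = split_deallocs(stmts, num_files=2)
    for fname, text in sources.items():
        print(f"--- {fname} ---")
        print(text, end="")
    print("\n".join(calls))
```

In this sketch, changing `num_files` is the only edit needed to produce more or fewer generated files, which mirrors the point above that the file count is easy to change on the source side while the build rules still have to list the resulting files.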